Novel low-overhead roll-forward recovery scheme for distributed systems
Abstract
An efficient roll-forward checkpointing/recovery scheme for distributed systems is presented. This work improves on our earlier work. The concept of forced checkpoints makes it possible to design a single-phase, non-blocking algorithm for finding consistent global checkpoints. The scheme offers the main advantages of both the synchronous and the asynchronous approaches, namely simple recovery and a simple way of creating checkpoints. The algorithm produces a reduced number of checkpoints. Because each process decides independently whether or not to take a forced checkpoint, the algorithm is simple, fast and efficient. The proposed work offers better performance than some noted existing works. In addition, the advantages stated above ensure that the algorithm can work efficiently in mobile computing environments.
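The abstract does not state the rule each process uses to decide on a forced checkpoint. As a rough illustration of the general idea only, the sketch below uses a well-known index-based rule from communication-induced checkpointing: every message carries the sender's checkpoint index, and a receiver whose own index is lower takes a forced checkpoint before delivering the message. The names `Process`, `Message`, `basic_checkpoint` and `receive` are hypothetical, and this is an assumed stand-in, not the algorithm of the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    payload: str
    sender_index: int  # checkpoint index piggybacked by the sender


@dataclass
class Process:
    name: str
    index: int = 0                                   # index of the latest checkpoint
    state: list = field(default_factory=list)        # toy application state
    checkpoints: dict = field(default_factory=dict)  # index -> saved state

    def __post_init__(self):
        self.checkpoints[0] = []  # initial checkpoint of the empty state

    def basic_checkpoint(self):
        """Checkpoint taken on the process's own schedule."""
        self.index += 1
        self.checkpoints[self.index] = list(self.state)

    def send(self, payload):
        """Attach the current checkpoint index to every outgoing message."""
        return Message(payload, self.index)

    def receive(self, msg):
        """Local decision (assumed rule): force a checkpoint if the sender is ahead.

        The forced checkpoint is saved BEFORE the message is delivered,
        so it never records a message sent after the sender's checkpoint.
        Returns True when a forced checkpoint was taken.
        """
        forced = msg.sender_index > self.index
        if forced:
            self.index = msg.sender_index
            self.checkpoints[self.index] = list(self.state)
        self.state.append(msg.payload)
        return forced


p, q = Process("p"), Process("q")
p.basic_checkpoint()          # p moves to index 1
forced = q.receive(p.send("x"))
print(forced, q.index)        # q was behind, so it takes a forced checkpoint
```

The decision in `receive` uses only local state and the index carried by the message, with no extra coordination messages or blocking, which is the property the abstract attributes to forced checkpoints.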
Similar resources
Cache based fault recovery for distributed systems
No cache-based techniques for roll-forward fault recovery exist at present. A split-cache approach is proposed that provides efficient support for checkpointing and roll-forward fault recovery in distributed systems. This approach obviates the use of discrete stable storage or explicit synchronization among the processors. Stability of the checkpoint intervals is used as a driver for real time op...
Distributed Recovery Units: An Approach for Hybrid and Adaptive Distributed Recovery
Traditionally, distributed recovery schemes have been designed for systems consisting of multiple recovery units. Each recovery unit (RU) resides on a single processor and it can fail and recover as a whole. This report introduces the "distributed recovery unit (DRU)" abstraction as an approach for design of "hybrid" and "adaptive" recovery schemes for distributed systems. The distributed syste...
Another Two-Level Failure Recovery Scheme: Performance
This report deals with the design and evaluation of a "two-level" failure recovery scheme for distributed systems. In our previous work [30, 32], we motivated a "two-level" recovery approach that tolerates the more probable failures with a low overhead, and less probable failures with possibly higher overhead. The two-level approach can achieve a smaller overhead as compared to traditional recov...
Roll-Forward and Rollback Recovery: Performance-Reliability Trade-Off
Dhiraj K. Pradhan, Nitin H. Vaidya. Department of Computer Science, Texas A&M University, College Station, TX 77843-3112. Abstract: Performance and reliability achieved by a modular redundant system depend on the recovery scheme used. Typically, gain in performance using comparable resources results in reduced reliability. Several high-performance computers are not...
An Improved Logging and Checkpointing Scheme for Recoverable Distributed Shared Memory
The distributed shared memory (DSM) system transforms an existing network of workstations into a powerful shared-memory parallel computer which could deliver superior price/performance. However, with more workstations engaged in the system and longer execution time, the probability of faults increases, which could render the system useless. Several checkpointing and logging schemes have been propos...
Journal title:
- IET Computers & Digital Techniques
Volume 1, issue
Pages -
Publication date 2007